awscli
sudo apt install awscli
aws configure
aws s3 ls s3://<your bucket name>
s3://mybucket/myfolder/myother_folder/
files.txt
you can run with input_dir='/path/to/files.txt'
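As a rough illustration of preparing such a list file, the sketch below writes one full object path per line. The one-path-per-line layout, the helper name `write_file_list`, and the image names are assumptions made for the example, not something the text above specifies.

```python
import tempfile
from pathlib import Path


def write_file_list(bucket_prefix, names, out_path):
    """Write one full object URI per line and return the URIs written."""
    prefix = bucket_prefix.rstrip("/") + "/"
    uris = [prefix + name.lstrip("/") for name in names]
    Path(out_path).write_text("\n".join(uris) + "\n", encoding="utf-8")
    return uris


# Hypothetical object names; replace them with the objects in your bucket.
list_path = Path(tempfile.mkdtemp()) / "files.txt"
uris = write_file_list(
    "s3://mybucket/myfolder/myother_folder/",
    ["image1.jpg", "image2.jpg"],
    list_path,
)
print(list_path)
```

The printed path is the value to pass as input_dir.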
FASTDUP_S3_ENDPOINT_URL
environment variable to the endpoint’s IP address, by either:
input_dir
as the local folder location of the copied data. The explanation above applies when the dataset is larger than the local disk (and potentially when multiple nodes run in parallel).
minio://google/mybucket/myfolder/myother_folder/
minio://
prefix).
files.txt
you can run with input_dir='/path/to/files.txt'
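One possible way to set the FASTDUP_S3_ENDPOINT_URL environment variable from Python, before the run starts, is sketched below. The address is a placeholder (a reserved documentation IP with an assumed scheme and port); the exact form your endpoint expects is not given in the text above.

```python
import os

# Placeholder endpoint: replace with the address of your own server.
# The scheme and port shown here are assumptions for illustration.
ENDPOINT = "http://192.0.2.10:9000"

# Set the variable for this process; child processes started later inherit it.
os.environ["FASTDUP_S3_ENDPOINT_URL"] = ENDPOINT

print(os.environ["FASTDUP_S3_ENDPOINT_URL"])
```

Setting the variable inside the script keeps the configuration next to the run itself; exporting it in the shell before launching Python would have the same effect.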